Learning discriminative temporal patterns in speech: development of novel TRAPS-like classifiers
Authors
Abstract
Motivated by the temporal processing properties of human hearing, researchers have explored various methods to incorporate temporal and contextual information in ASR systems. One such approach, TempoRAl PatternS (TRAPS), takes temporal processing to the extreme and analyzes the energy pattern over long periods of time (500 ms to 1000 ms) within separate critical bands of speech. In this paper we extend the work on TRAPS by experimenting with two novel variants developed to address some shortcomings of the TRAPS classifiers. Both the Hidden Activation TRAPS (HATS) and the Tonotopic MultiLayer Perceptron (TMLP) require 84% fewer parameters than TRAPS, yet achieve significant reductions in phone recognition error when tested on the TIMIT corpus under clean, reverberant, and several noise conditions. In addition, the TMLP is trained in a single stage and does not require critical-band-level training targets. Using these variants, we find that approximately 20 discriminative temporal patterns per critical band are sufficient for good recognition performance. In combination with a conventional PLP system, these TRAPS variants achieve significant additional performance improvements.
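The input that TRAPS-style classifiers operate on, a long temporal trajectory of energy within each critical band, can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `band_trajectories`, the 10 ms frame step, the 51-frame window (about 510 ms, at the short end of the range quoted above), the 15-band example, and the edge-padding strategy are all assumptions made for illustration.

```python
import numpy as np


def band_trajectories(band_energies, context_frames=51):
    """Cut per-band temporal windows out of a critical-band energy matrix.

    band_energies: array of shape (num_frames, num_bands), e.g. log energies
        computed every 10 ms (the frame step is an assumption).
    context_frames: odd window length in frames; 51 frames at 10 ms span
        roughly 510 ms (an assumed value, not taken from the paper).

    Returns an array of shape (num_frames, num_bands, context_frames) whose
    entry [t, b] is the energy trajectory of band b centred on frame t.
    """
    if context_frames % 2 == 0:
        raise ValueError("context_frames must be odd")
    half = context_frames // 2
    # Repeat the first and last frames so that every frame has full context
    # (hypothetical edge handling chosen for simplicity).
    padded = np.pad(band_energies, ((half, half), (0, 0)), mode="edge")
    # Sliding windows along the time axis; the window axis is appended last.
    return np.lib.stride_tricks.sliding_window_view(
        padded, context_frames, axis=0
    )


# Example: 200 frames of 15 hypothetical critical bands.
energies = np.random.default_rng(0).normal(size=(200, 15))
windows = band_trajectories(energies)
print(windows.shape)  # (200, 15, 51)
```

In a TRAPS-like system each band's trajectory `windows[:, b, :]` would feed a separate band-level classifier, whereas a conventional short-term system would instead look at all bands of a single frame or a few neighbouring frames.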
Similar resources
TRAPS - Classifiers of Temporal
TRAPS - Classifiers of Temporal Patterns. Hynek Hermansky (1, 2) and Sangita Sharma (1). (1) Oregon Graduate Institute of Science and Technology, Portland, Oregon, USA; (2) International Computer Science Institute, Berkeley, California, USA. Abstract: The work proposes a radically different set of features for ASR where TempoRAl Patterns of spectral energies are used in place of the ...
TRAPS - classifiers of temporal patterns
The work proposes a radically different set of features for ASR where TempoRAl Patterns of spectral energies are used in place of the conventional spectral patterns. The approach has several inherent advantages, among them robustness to stationary or slowly varying disturbances.
Distributed Speech Recognition Usin Traps-estimated Manne
In this paper, we investigate the use of TemPoRal PatternS (TRAPS) classifiers for estimating manner of articulation features on the small-vocabulary Aurora-2002 database. By combining a stream of TRAPS-estimated manner features with a stream of noise-robust MFCC features (earlier proposed in the Aurora-2002 evaluation by OGI, ICSI and Qualcomm), we obtain an average absolute improvement of 0.4...
A Paradigm for Limited Vocabulary Speech Recognition Based on Redundant Spectro-Temporal Feature Sets
Speech recognition techniques have come to rely almost completely on HMM-based frameworks. In this paper, we present a novel paradigm for small-vocabulary speech recognition based on a recently proposed word spotting technique. Recent work using discriminative classifiers with ordered spectro-temporal features to detect the presence of keywords obtained encouraging improvements over HMM-based m...
Temporal Patterns (TRAPS) in ASR of Noisy
Phoenix, Arizona, USA, March 1999. Temporal Patterns (TRAPS) in ASR of Noisy Speech. Hynek Hermansky (1, 2) and Sangita Sharma (1). (1) Oregon Graduate Institute of Science and Technology, Portland, Oregon, USA; (2) International Computer Science Institute, Berkeley, California, USA. Abstract: In this paper we study a new approach to processing temporal information for automat...